On the Convergence and Robustness of Training GANs with Regularized Optimal Transport
Generative Adversarial Networks (GANs) are among the most practical methods for learning data distributions. A popular GAN formulation uses the Wasserstein distance as a metric between probability distributions. Unfortunately, minimizing the Wasserstein distance between the data distribution and the generative model distribution is computationally challenging: the objective is non-convex, non-smooth, and even hard to compute. In this work, we show that obtaining gradient information for the smoothed Wasserstein GAN formulation, which is based on regularized Optimal Transport (OT), is computationally effortless, and hence one can apply first-order optimization methods to minimize this objective. Consequently, we establish a theoretical convergence guarantee to stationarity for a proposed class of GAN optimization algorithms. Unlike the original non-smooth formulation, our algorithm only requires solving the discriminator to approximate optimality. We apply our method to learning MNIST digits as well as CIFAR-10 images. Our experiments show that our method is computationally efficient and generates images comparable to those of state-of-the-art algorithms given the same architecture and computational power.
Reviews: On the Convergence and Robustness of Training GANs with Regularized Optimal Transport
SUMMARY The authors investigate the task of training a Generative Adversarial Network (GAN) model with an optimal transport (OT) loss. They focus on regularized OT losses and show that approximate gradients of these losses can be obtained by approximately solving a regularized OT problem (Thm 4.1). As a consequence, a non-convex stochastic gradient method for minimizing this loss has a provable convergence rate to stationarity (Thm 4.2). The analysis also applies to Sinkhorn losses. The authors then numerically explore the behavior of a practical algorithm in which the dual variables are parametrized by neural networks (the theory does not immediately apply in this setting, because estimating the loss gradient becomes a non-convex problem).
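To make the idea behind Thm 4.1 concrete, here is a minimal sketch of how a gradient of an entropy-regularized OT loss can be read off from an approximately optimal transport plan. It is an illustration under stated assumptions, not the authors' algorithm: it assumes a squared Euclidean cost, uniform weights on both sample sets, plain (not log-domain) Sinkhorn iterations, and the standard envelope-theorem argument that the gradient with respect to a generated sample is the plan-weighted gradient of the cost. The function names `sinkhorn_plan` and `grad_wrt_generated` and the parameters `eps` and `n_iters` are hypothetical.

```python
import math


def sq_dist(x, y):
    """Squared Euclidean distance between two points given as lists."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def sinkhorn_plan(xs, ys, eps=1.0, n_iters=200):
    """Approximately solve entropy-regularized OT between two point sets.

    Assumes uniform weights on both sets. Returns the transport plan as
    a nested list with plan[i][j] = mass moved from xs[i] to ys[j].
    """
    n, m = len(xs), len(ys)
    a = [1.0 / n] * n  # weights on real samples (assumed uniform)
    b = [1.0 / m] * m  # weights on generated samples (assumed uniform)
    # Gibbs kernel K_ij = exp(-c_ij / eps)
    K = [[math.exp(-sq_dist(x, y) / eps) for y in ys] for x in xs]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


def grad_wrt_generated(xs, ys, plan):
    """Envelope-theorem gradient of the regularized OT loss w.r.t. each ys[j].

    For the cost c_ij = ||x_i - y_j||^2, the gradient at y_j is
    sum_i plan[i][j] * 2 * (y_j - x_i).
    """
    d = len(ys[0])
    grads = []
    for j, y in enumerate(ys):
        g = [0.0] * d
        for i, x in enumerate(xs):
            for k in range(d):
                g[k] += plan[i][j] * 2.0 * (y[k] - x[k])
        grads.append(g)
    return grads


# Toy 1-D example: generated points sit to the right of the real points.
real = [[0.0], [1.0]]
fake = [[2.0], [3.0]]
plan = sinkhorn_plan(real, fake)
grads = grad_wrt_generated(real, fake, plan)
```

In this toy example every gradient entry is positive, so a gradient descent step moves the generated points left, toward the real ones. In a GAN the generated points would be outputs of a generator network and these per-sample gradients would be backpropagated into its parameters. The practical algorithm described in the review replaces the explicit dual scalings (`u`, `v` here) with neural networks, which is where the convergence theory no longer applies directly.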